AITopics | objective discovered online

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

Neural Information Processing Systems, Dec-24-2025, 10:52:56 GMT

Deep reinforcement learning includes a broad family of algorithms that parameterise an internal representation, such as a value function or policy, by a deep neural network. Each algorithm optimises its parameters with respect to an objective, such as Q-learning or policy gradient, that defines its semantics. In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience with its environment. Over time, this allows the agent to learn how to learn increasingly effectively. Furthermore, because the objective is discovered online, it can adapt to changes over time. We demonstrate that the algorithm discovers how to address several important issues in RL, such as bootstrapping, non-stationarity, and off-policy learning. On the Atari Learning Environment, the meta-gradient algorithm adapts over time to learn with greater efficiency, eventually outperforming the median score of a strong actor-critic baseline.
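The meta-gradient idea in the abstract can be illustrated with a deliberately tiny scalar sketch. It is not the paper's algorithm: it assumes, purely for illustration, that the discovered objective takes the form of a learned target `eta * feature` in place of a deep network, that the inner loss is a squared error to that target, and that the outer loss is a squared error to a known return. The names `meta_gradient_toy`, `true_return`, `feature`, `alpha`, and `beta` are all hypothetical.

```python
def meta_gradient_toy(true_return=2.0, feature=1.0, alpha=0.5, beta=0.1, steps=2000):
    """Scalar toy: learn a target g_eta = eta * feature by meta-gradient descent.

    Assumption (not from the source): the learned objective is a squared
    error between the prediction theta and the target g_eta.
    """
    theta = 0.0  # agent parameter: a scalar value prediction
    eta = 0.0    # meta-parameter that defines the learned target
    for _ in range(steps):
        # Inner step: one gradient step on 0.5 * (theta - g_eta)^2.
        target = eta * feature
        theta_new = theta - alpha * (theta - target)

        # Outer loss 0.5 * (theta_new - true_return)^2 scores the updated
        # prediction. By the chain rule,
        #   d(outer)/d(eta) = (theta_new - true_return) * d(theta_new)/d(eta),
        # with d(theta_new)/d(eta) = alpha * feature (old theta held fixed).
        meta_grad = (theta_new - true_return) * alpha * feature

        # Meta step: adjust the target so future inner steps do better.
        eta -= beta * meta_grad
        theta = theta_new
    return theta, eta


if __name__ == "__main__":
    theta, eta = meta_gradient_toy()
    print(f"theta={theta:.4f}, eta={eta:.4f}")
```

In this sketch the target is never told the true return directly; it is only shaped through the effect it has on the updated prediction. The meta-parameter therefore settles where `eta * feature` equals the return, which is the scalar analogue of an agent learning what it should learn from.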

meta-gradient reinforcement learning, name change, objective discovered online, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Review for NeurIPS paper: Meta-Gradient Reinforcement Learning with an Objective Discovered Online

Neural Information Processing Systems, Jan-27-2025, 14:31:58 GMT

Strengths: The idea of formulating the inner loss for meta-RL as an objective that the agent discovers on its own is interesting and novel. More generally, letting the algorithm discover its own objective moves it one step closer to automated machine intelligence than conventional meta-RL methods, which rely heavily on expert design choices, such as hyperparameters, to perform learning-to-learn. The authors present extensive experimental results to evaluate the proposed method. It is evaluated on three task domains: a Catch game, to demonstrate that the method can effectively learn bootstrapping; a 5-state random walk, to demonstrate that the method works in non-stationary environments; and ALE, a large-scale RL testbed. In all three domains, the proposed method achieves a noticeable performance improvement over the compared baselines.

meta-gradient reinforcement learning, neurips paper, objective discovered online, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Review for NeurIPS paper: Meta-Gradient Reinforcement Learning with an Objective Discovered Online

Neural Information Processing Systems, Jan-27-2025, 14:31:51 GMT

The reviewers agreed that this is an interesting, novel, and well-executed contribution. I would like to bring up two issues that were raised in the discussion, and ask the authors to address them in their final version. This should at least be mentioned/discussed.

meta-gradient reinforcement learning, neurips paper, objective discovered online

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)
